A global database on coral recovery following marine heatwaves

Coral reefs support the world’s most diverse marine ecosystem and provide invaluable goods and services for millions of people worldwide. They are, however, experiencing frequent and intense marine heatwaves that cause coral bleaching and mortality. Coarse-grained climate models predict that few coral reefs will survive the 3 °C sea-surface temperature rise in the coming century. Yet field studies show localized pockets of coral survival and recovery even under high-temperature conditions. Quantifying recovery from marine heatwaves is central to making accurate predictions of coral-reef trajectories into the near future. Here we introduce the world’s most comprehensive database on coral recovery following marine heatwaves and other disturbances, the Heatwaves and Coral-Recovery Database (HeatCRD), which encompasses 29,205 data records spanning 44 years from 12,266 sites, 83 countries, and 160 data sources. These data provide essential information for coral-reef scientists and managers to guide coral-reef conservation efforts at both local and regional scales.


Background & Summary
The intensity and frequency of anomalously high ocean temperatures have increased over the past four decades 1,2. Such marine heatwaves have been particularly evident on coral reefs globally [3][4][5]. High ocean temperatures lead to coral bleaching, coral mortality, and changes in coral assemblages. Many recent studies have focused on coral bleaching as the immediate effect of heatwaves at oceanic scales [4][5][6], but only a few studies (see, for example, Gonzalez-Barrios et al. 7) have focused on coral recovery. Likewise, several recent databases that complement the work presented in this paper have addressed oceanic-scale coral bleaching 8 and coral cover [9][10][11]. The dynamics and trajectories of corals depend on a suite of parameters, including the intensity of the heatwave, the geographical location and depth of the site, how much coral remains following the heatwave, the composition of the community, how quickly the corals grow, and the extent of subsequent recruitment following the heatwave. The Heatwaves and Coral-Recovery Database (HeatCRD) 1 fulfills an urgent need to compile data following marine heatwaves at oceanic scales to determine (i) how rapidly coral reefs recover from marine heatwaves, (ii) to what extent recovery varies geographically 12, and (iii) which local conditions influence recovery rates.
Most field studies on coral reefs estimate the percentage of coral cover, which is the two-dimensional area that corals occupy across a coral-reef substrate. The primary data presented here are the percentages of total coral cover at a study site over time. A study site is a unique latitude-longitude coordinate point at a given depth.
To date, we have 29,205 data records for 12,266 sites, spanning four decades from 1977 to 2020 (Fig. 1, Supplementary Table 1) 1. There are two main data sources in the database, the first being a compilation of data from established monitoring programs (73%), and the second being new data extracted from the literature (27%). All time-series datasets have been checked at multiple levels and are quality-controlled. The database also contains environmental variables at each site, including site exposure to waves, distance to land, level of protection from fishing, habitat type, mean turbidity, and a suite of sea-surface temperature metrics at the time of the survey.

Methods
To date, we have coral data for 12,266 sites in 83 countries, collected from 1977 to 2020 (Fig. 1; Supplementary Table 1) 1. The Heatwaves and Coral-Recovery Database (HeatCRD) is available as a Microsoft Access database file and as a SQLite database file 1, the latter of which is directly accessible through R. Example R code that extracts analysis-ready data from the SQLite file is provided as "DB_Querying_RSQLite.R". Data in the HeatCRD are stored in 15 related Tables (see Fig. 2, schematic of the database structure). Some geographical regions have more comprehensive time-series data than others; the Great Barrier Reef, Australia, and the Florida Keys, USA, have the most comprehensive time-series data (Fig. 1).
The primary geographical variable in the HeatCRD is a 'site' on a reef, recorded as a latitude and longitude coordinate. The static locality data (i.e., latitudinal and longitudinal coordinates, distance to land, and exposure) are stored in the Table "Site_Description_tbl". A site can have multiple sampling events (Fig. 1) (i.e., multiple depths and/or multiple dates sampled), and these temporal events are stored separately in the Table "Sample_Event_tbl". Data corresponding to these sampling events are stored in two related Tables: "Cover_tbl" (% hard coral cover) and "Thermal_Stress_tbl". Tables with enumerated lists are used to ensure integrity in naming conventions; such Tables are denoted with "LUT" (look-up table).
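The relationship between sites, sampling events, and cover records can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than the R script distributed with the database. The three table names are taken from the text, but every column name (Site_ID, Sample_ID, Depth_m, and so on) and every row in the toy database is a hypothetical placeholder, so the real schema should be checked before adapting the query.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database mimicking the site/event/cover layout.

    Column names and rows are hypothetical, not the real HeatCRD schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Site_Description_tbl (
            Site_ID INTEGER PRIMARY KEY,
            Latitude REAL,
            Longitude REAL
        );
        CREATE TABLE Sample_Event_tbl (
            Sample_ID INTEGER PRIMARY KEY,
            Site_ID INTEGER REFERENCES Site_Description_tbl(Site_ID),
            Depth_m REAL,
            Sample_Date TEXT
        );
        CREATE TABLE Cover_tbl (
            Sample_ID INTEGER REFERENCES Sample_Event_tbl(Sample_ID),
            Percent_Hard_Coral_Cover REAL
        );
    """)
    # Made-up rows: one site surveyed twice, another surveyed once.
    conn.executemany("INSERT INTO Site_Description_tbl VALUES (?, ?, ?)",
                     [(1, -10.0, 150.0), (2, 20.0, -80.0)])
    conn.executemany("INSERT INTO Sample_Event_tbl VALUES (?, ?, ?, ?)",
                     [(10, 1, 6.0, "1998-03-15"),
                      (11, 1, 6.0, "2004-03-15"),
                      (12, 2, 5.0, "2015-08-15")])
    conn.executemany("INSERT INTO Cover_tbl VALUES (?, ?)",
                     [(10, 12.5), (11, 31.0), (12, 8.2)])
    return conn


def cover_time_series(conn, site_id):
    """Return (date, depth, % cover) rows for one site, oldest first."""
    query = """
        SELECT e.Sample_Date, e.Depth_m, c.Percent_Hard_Coral_Cover
        FROM Site_Description_tbl AS s
        JOIN Sample_Event_tbl AS e ON e.Site_ID = s.Site_ID
        JOIN Cover_tbl AS c ON c.Sample_ID = e.Sample_ID
        WHERE s.Site_ID = ?
        ORDER BY e.Sample_Date
    """
    return conn.execute(query, (site_id,)).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    for row in cover_time_series(conn, 1):
        print(row)
```

Keeping the static site attributes in one Table and the repeated sampling events in another means that a site's coordinates are stored once, however many times the site is resurveyed.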
Coral-cover data were extracted from the primary literature using WebPlotDigitizer 13. Sampling points that fell on land or were >1 km from any coral reef were removed. If sites were not named or given explicit coordinates, the coordinates were estimated and a comment was added to the data table. The coordinates were entered into Google Earth, and the location names, distance to land in meters, and exposure were determined and recorded for each site. Exposure to waves was based on a site's potential exposure to predominant winds, swell, and fetch (i.e., the extent of open ocean). Mean turbidity (Kd490) was added for each site 14; although turbidity technically refers to the suspended particles in the water column, Kd represents changes in water clarity caused by particles, dissolved materials, and the water itself. We used the term turbidity in previous publications and therefore remain consistent in the present study. The Marine Ecoregions of the World (MEOW) shapefiles 15 and the International Union for Conservation of Nature (IUCN) World Database on Protected Areas 16 were used to determine in which marine realm and protected area each site was located. Veron's ecoregion shapefiles 17 were used to determine the ecoregion of each site. Data on the types of reef habitats were extracted from the Allen Coral Atlas 18.
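The 1 km exclusion rule can be sketched as a nearest-distance filter. The text does not state how the distance to a reef was computed, so the great-circle (haversine) distance to a list of reef reference points used below is an assumption, and the function names and the `reef_points` input are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def keep_points_near_reef(points, reef_points, max_km=1.0):
    """Keep (lat, lon) points lying within max_km of the nearest reef point.

    Assumes reef_points is a non-empty list of (lat, lon) tuples.
    """
    kept = []
    for lat, lon in points:
        nearest = min(haversine_km(lat, lon, rlat, rlon)
                      for rlat, rlon in reef_points)
        if nearest <= max_km:
            kept.append((lat, lon))
    return kept


if __name__ == "__main__":
    reef = [(0.0, 0.0)]
    samples = [(0.0, 0.005), (0.0, 0.02)]  # roughly 0.56 km and 2.2 km away
    print(keep_points_near_reef(samples, reef))
```

A production workflow would more likely test points against reef polygons in a GIS; the point-to-point version here only illustrates the threshold logic.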
Normalization. If the site coordinates were not already in decimal degrees in the original data, they were converted to decimal degrees in the HeatCRD. Latitude and longitude coordinates were determined with Google Earth when coordinate information was not explicitly provided in the text of the published papers. The Coral Reef Temperature Anomaly Database (CoRTAD version 6) 19, which is a collection of sea-surface temperature variables, was used to extract temperature metrics for each sampling event. CoRTAD values were only extracted for a sampling event if the sampled data had a clearly defined month and year; where a sampling event lacked a specific day, the 15th day of the month was used. For any data given as a range (i.e., depth or date), the midpoint was taken and a comment was added to the HeatCRD.
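The normalization rules can be sketched as three small helpers. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and it assumes coordinates arrive as degrees, minutes, and seconds with a hemisphere letter.

```python
from datetime import date


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Southern and western hemispheres are returned as negative values.
    """
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


def range_midpoint(low, high):
    """Midpoint of a numeric range, e.g. a depth reported as 3 to 9 m."""
    return (low + high) / 2


def date_midpoint(start, end):
    """Midpoint of a date range, truncated to a whole day."""
    return start + (end - start) / 2


def sampling_date(year, month, day=None):
    """Build a sampling date, defaulting to the 15th when the day is missing."""
    return date(year, month, 15 if day is None else day)


if __name__ == "__main__":
    print(dms_to_decimal(18, 30, 0, "S"))
    print(range_midpoint(3, 9))
    print(date_midpoint(date(2016, 4, 1), date(2016, 4, 11)))
    print(sampling_date(2016, 4))
```

Defaulting to the middle of the month limits the date error to roughly two weeks in either direction when the temperature metrics are matched to a survey.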

Fig. 1 The global HeatCRD (Heatwaves and Coral-Recovery Database) study sites (n = 12,266) 1. The colors signify the number of repeated surveys (from 1 to 26) at each study site; the sites span 83 countries and were surveyed from 1977 to 2020.

Fig. 2 Schematic of the global HeatCRD (Heatwaves and Coral-Recovery Database) 1 showing the relationships among the 15 Tables.